Unsupervised PCFG Induction for Grounded Language Learning with Highly Ambiguous Supervision
Abstract
“Grounded” language learning employs training data in the form of sentences paired with relevant but ambiguous perceptual contexts. Börschinger et al. (2011) introduced an approach to grounded language learning based on unsupervised PCFG induction. Their approach works well when each sentence potentially refers to one of a small set of possible meanings, as in the sportscasting task. However, it does not scale to problems with a large set of potential meanings for each sentence, such as the navigation instruction following task studied by Chen and Mooney (2011). This paper presents an enhancement of the PCFG approach that scales to such problems with highly ambiguous supervision. Experimental results on the navigation task demonstrate the effectiveness of our approach.
Similar resources
Generative Models of Grounded Language Learning with Ambiguous Supervision
“Grounded” language learning is the process of learning the semantics of natural language with respect to relevant perceptual inputs. Toward this goal, computational systems are trained with data in the form of natural language sentences paired with relevant but ambiguous perceptual contexts. With such ambiguous supervision, it is required to resolve the ambiguity between a natural language (NL...
Exploiting social information in grounded language learning via grammatical reductions
This paper uses an unsupervised model of grounded language acquisition to study the role that social cues play in language acquisition. The input to the model consists of (orthographically transcribed) child-directed utterances accompanied by the set of objects present in the non-linguistic context. Each object is annotated by social cues, indicating e.g., whether the caregiver is looking at or...
Natural Language Grammar Induction Using a Constituent-Context Model
This paper presents a novel approach to the unsupervised learning of syntactic analyses of natural language text. Most previous work has focused on maximizing likelihood according to generative PCFG models. In contrast, we employ a simpler probabilistic model over trees based directly on constituent identity and linear context, and use an EM-like iterative procedure to induce structure. This me...
PCFG Induction for Unsupervised Parsing and Language Modelling
The task of unsupervised induction of probabilistic context-free grammars (PCFGs) has attracted a lot of attention in the field of computational linguistics. Although it is a difficult task, work in this area is still very much in demand since it can contribute to the advancement of language parsing and modelling. In this work, we describe a new algorithm for PCFG induction based on a principle...
Unsupervised Learning of Probabilistic Context-Free Grammar using Iterative Biclustering (Extended Version)
This paper presents PCFG-BCL, an unsupervised algorithm that learns a probabilistic context-free grammar (PCFG) from positive samples. The algorithm acquires rules of an unknown PCFG through iterative biclustering of bigrams in the training corpus. Our analysis shows that this procedure uses a greedy approach to adding rules such that each set of rules that is added to the grammar results in th...